Abstract

The goal of this paper is to survey the properties of the eigenvalue relaxation for least squares binary
problems. This relaxation is obtained as the Lagrangian dual of the original problem with an implicit
compact constraint and, as such, is a convex problem with polynomial time complexity.
Moreover, as a main practical advantage of this relaxation over the standard Semi-Definite Programming
approach, several efficient bundle methods are available for this problem, making it possible to address
problems of very large dimension. The necessary tools from convex analysis are recalled and shown at work
in handling the question of exactness of this relaxation. Two applications are described. The first is
binary image reconstruction and the second is multiuser detection in CDMA systems.
1 Introduction



Several problems in engineering, in particular in signal and image processing, require estimating binary
vectors corrupted by noise and can be simply addressed using the least squares principle under binarity
constraints. The resulting problem is the minimization of a quadratic form over $\{-1,1\}^n$, a problem which is
known to be NP-hard in general. One of the main approaches to relax this problem into a convex one is the
Semi-Definite Programming (SDP) relaxation, which has been extensively used in classification, pattern recognition
and communication systems. Some of the main achievements in the study of the SDP relaxation were obtained
by Goemans and Williamson [13] and in [11]. However, solving a Semi-Definite Program in practice relies on
interior point methods which, although enjoying nice theoretical convergence properties, are limited to problems
of size up to $500 \times 500$. On the other hand, bundle methods that are very efficient in practice are available
for the eigenvalue relaxation of the same binary quadratic optimization problems. We refer the reader to [1] for a
discussion of the practical superiority of bundle methods for solving certain semi-definite programs such as the
ones appearing in the present paper. Despite this empirical fact in favor of the eigenvalue relaxation, one of
the main reasons most users prefer the SDP relaxation is that good primal binary solutions can be recovered
using Goemans and Williamson's randomized algorithm. The main motivation of the present paper is to show
how a solution of the SDP can be recovered from a solution of the eigenvalue relaxation. As a by-product, a
new geometric interpretation of the randomized algorithm is proposed.

Penalized binary least squares estimation problems are problems of the form 

\[ \min_{x} \; \|y - Ax\|^2 + \nu\, x^t P x \quad \text{s.t.} \quad x \in \{-1,1\}^n, \tag{1.0.1} \]

where the vector $y \in \mathbb{R}^m$ is the observed data, the matrix $A \in \mathbb{R}^{m \times n}$ represents the "filter", the vector
$x \in \mathbb{R}^n$ is the signal, or parameter vector, that has to be estimated, and the term $\nu x^t P x$ is a penalization term
that can often be interpreted as a priori information in terms of Bayesian statistics.

This problem belongs to the larger class of minimization of quadratic forms over binary vectors, which is
known to be NP-hard. Much work has been devoted to constructing Semi-Definite Programming (SDP) based
relaxations for general quadratic binary problems. Semi-Definite programs are linear optimization problems over
symmetric matrices with real coefficients and with the additional convex constraint of positive semidefiniteness;
see for instance [6] or [2] for excellent introductions to convex programming and in particular SDP. SDP methods
have already played an important role in various topics in signal processing and we refer to [5]
for a nice survey of possible applications. A common feature of essentially all the existing relaxations is that
they can be obtained using Lagrange duality, which is a general methodology for obtaining lower bounds to
hard minimization problems, as overviewed in [4] and [5].

The goal of the paper is to survey what is known about another Lagrangian duality based relaxation, 
namely the eigenvalue relaxation, for this problem. This relaxation was first proposed by Delorme and Poljak 
[19] for the max-cut problem. See also the work of Poljak, Rendl and Wolkowicz [7] for more details. The
main advantage of the eigenvalue relaxation over the SDP relaxation is that the eigenvalue relaxation can be 
solved much faster than the SDP relaxation, as reported for instance in [T], [^, [Hj and [TD]. This remarkable 
computational tractability of the eigenvalue relaxation is the main motivation for writing this detailed survey. 

The content of the paper is as follows. The second section is devoted to a rapid presentation of the relaxation 
and its relationship with Lagrangian duality. We also recall a simple and well known certificate for exactness 
of the relaxation, i.e. the fact that a globally optimal binary solution is obtained. 

The third section details the relationships between the Semi-Definite relaxation and the eigenvalue
relaxation. The main result of this section is the following: a solution of the SDP relaxation can be recovered
from the solution of the eigenvalue relaxation. The case of inexact solutions to the eigenvalue relaxation is also
studied.

The fourth section deals with the problem of recovering binary primal solutions from the dual scheme. We
first give sufficient conditions under which strong duality holds and the eigenvectors of squared norm $n+1$
associated with the maximum eigenvalue at optimality are binary solutions. Next, in the case where strong duality does not
hold, we show that Goemans and Williamson's randomized algorithm has a very natural meaning when viewed 
in terms of the optimal eigenspace associated to the maximal eigenvalue in the eigenvalue relaxation. 

In the last section, we present simulation experiments for binary image denoising and CDMA
Multiuser Detection problems. The first of these problems has been previously approached by stochastic
methods based on Markov chains, like simulated annealing and Metropolis-Hastings schemes; see for instance
[It] and the more recent work of Gibbs [M]. The approach discussed here was presented in [15]. Recently 
a lot more problems have been addressed using the SDP relaxation in [12]. The results obtained so far are 
quite encouraging and the approach performs well on very dirty images. We prove that strong duality holds
for the image denoising problem, thus recovering the polynomial solvability result of Greig, Porteous
and Seheult as a special case and by a very different path. Passing to the second problem, our Monte Carlo
experiments show that the average computational effort for solving the eigenvalue relaxation, as a function of
the number of users, grows more slowly than for the SDP relaxation with the standard implementations available
with Scilab.

Notations. In the sequel we will use the following notations. The inner product on $\mathbb{R}^n$ is denoted by $\langle\cdot,\cdot\rangle$,
and the set of real symmetric matrices of order $n$ is denoted by $\mathbb{S}_n$. The partial order $\succeq$ denotes the Loewner
ordering, i.e. for $A$ and $B$ in $\mathbb{S}_n$, $A \succeq B$ means that $A - B$ is positive semidefinite. For a set $S$ in $\mathbb{R}^n$, conv($S$)
denotes the convex hull of $S$ and $\overline{S}$ denotes its closure. For a matrix $A$ in $\mathbb{S}_n$, $d(A)$ denotes its diagonal vector
and for $a$ in $\mathbb{R}^n$, $D(a)$ denotes the diagonal matrix whose diagonal vector is $a$. If an equation number #
corresponds to an optimization problem, then opt(#) will denote the optimum value for this problem.



2 The eigenvalue relaxation 

We first introduce the eigenvalue relaxation and, at the same time, give a quick refresher on Lagrangian
duality, collecting all the results that will play an essential role in the sequel. The proofs of almost all the
results presented here can be found in [18].

2.1 The Lagrangian dual and the eigenvalue relaxation 

The binary least-squares estimation problem is in fact equivalent to the homogenized problem

\[ \max_{x \in \mathbb{R}^{n+1}} \; -x^t \begin{bmatrix} A^tA + \nu P & -A^ty \\ -y^tA & y^ty \end{bmatrix} x \quad \text{s.t.} \quad x \in \{-1,1\}^{n+1}. \tag{BLS} \]

Indeed, if we add the constraint $x_{n+1} = 1$ in (BLS), we obtain exactly the binary least squares problem. Now,
if $x^*$ is a solution of (BLS), then $-x^*$ is again a solution of (BLS), thus adding the constraint $x_{n+1} = 1$ is
in fact redundant, which proves the claimed equivalence. Set

\[ M = \begin{bmatrix} A^tA + \nu P & -A^ty \\ -y^tA & y^ty \end{bmatrix}. \]
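The equivalence between (1.0.1) and (BLS) can be checked numerically on a toy instance. The sketch below is only an illustration under stated assumptions: the data `A`, `y`, `P`, `nu` are made up, the helper names (`build_M`, `quad`, `penalized_ls`, `brute_force`) are hypothetical, and the brute-force enumeration is only feasible for very small $n$.

```python
from itertools import product

def build_M(A, y, P, nu):
    """Return M = [[A^t A + nu P, -A^t y], [-y^t A, y^t y]] as nested lists."""
    m, n = len(A), len(A[0])
    M = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(n):
        for j in range(n):
            M[i][j] = sum(A[k][i] * A[k][j] for k in range(m)) + nu * P[i][j]
        aty = sum(A[k][i] * y[k] for k in range(m))
        M[i][n] = -aty          # last column: -A^t y
        M[n][i] = -aty          # last row:    -y^t A
    M[n][n] = sum(v * v for v in y)
    return M

def quad(Q, x):
    """Quadratic form x^t Q x."""
    return sum(x[i] * Q[i][j] * x[j] for i in range(len(x)) for j in range(len(x)))

def penalized_ls(A, y, P, nu, x):
    """Objective of (1.0.1): ||y - A x||^2 + nu x^t P x."""
    resid = [y[k] - sum(A[k][j] * x[j] for j in range(len(x))) for k in range(len(y))]
    return sum(r * r for r in resid) + nu * quad(P, x)

def brute_force(A, y, P, nu):
    """Optimal values of (1.0.1) (a minimum) and of (BLS) (a maximum)."""
    n = len(A[0])
    best_orig = min(penalized_ls(A, y, P, nu, x) for x in product((-1, 1), repeat=n))
    M = build_M(A, y, P, nu)
    best_hom = max(-quad(M, x) for x in product((-1, 1), repeat=n + 1))
    return best_orig, best_hom

# Toy data (hypothetical, for illustration only).
A = [[1.0, 0.5, 0.0], [0.0, 1.0, -0.5]]
y = [0.7, -1.2]
P = [[1.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 1.0]]
nu = 0.3
best_orig, best_hom = brute_force(A, y, P, nu)
```

Since (BLS) is written as a maximization of $-x^tMx$, the two optimal values are expected to be opposite to each other, i.e. `best_hom == -best_orig` up to rounding.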



Notice further that the constraint $x_i \in \{-1,1\}$ is equivalent to $x_i^2 = 1$ for all $i = 1, \dots, n+1$. Thus, to
problem (BLS), we can associate the Lagrangian function

\[ L(x,u) = -x^tMx + \sum_{i=1}^{n+1} u_i (x_i^2 - 1) = x^t(D(u) - M)x - u^te, \]

where $e$ denotes the vector of all ones. Now we can add to the problem the implicit spherical constraint $x \in S_{n+1}$, with

\[ S_{n+1} = \{x \in \mathbb{R}^{n+1} \mid x^tx = n+1\}, \]

which is redundant with the binary constraints. Then, optimizing over this sphere, we obtain the Lagrangian
dual function, i.e.

\begin{align*}
\theta(u) &= \max_{x \in S_{n+1}} \; x^t(D(u) - M)x - u^te \\
&= \max_{x \in S_{n+1}} \; x^t(D(u) - M)x - \frac{u^te}{n+1}\, x^tx \\
&= \max_{x \in S_{n+1}} \; x^t\Big(D(u) - M - \frac{u^te}{n+1} I\Big)x
\end{align*}

which, using the Rayleigh-Ritz variational formulation of the largest eigenvalue of symmetric matrices, can be written

\[ \theta(u) = (n+1)\, \lambda_{\max}\Big(D(u) - M - \frac{e^tu}{n+1} I\Big). \tag{2.1.1} \]

Finally, the dual problem, i.e. the eigenvalue relaxation, is given by

\[ \min_{u \in \mathbb{R}^{n+1}} \theta(u). \tag{2.1.2} \]
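As an illustration of formula (2.1.1), the dual function can be evaluated with a symmetric eigenvalue routine. The following is a minimal sketch, assuming NumPy is available; the random matrix `M` is a stand-in for the matrix of (BLS), and `theta` and `opt_bls` are hypothetical helper names. The brute-force value is only there to compare the dual values with opt(BLS) on a tiny instance.

```python
import itertools
import numpy as np

def theta(u, M):
    """Dual function (2.1.1): (n+1) * lambda_max(D(u) - M - (e^t u / (n+1)) I)."""
    N = M.shape[0]                                   # N = n + 1
    shifted = np.diag(u) - M - (u.sum() / N) * np.eye(N)
    return N * np.linalg.eigvalsh(shifted)[-1]       # eigvalsh returns ascending eigenvalues

def opt_bls(M):
    """Brute-force value of max -x^t M x over x in {-1, 1}^N (tiny N only)."""
    N = M.shape[0]
    return max(-np.array(x) @ M @ np.array(x)
               for x in itertools.product((-1.0, 1.0), repeat=N))

rng = np.random.default_rng(0)
B = rng.standard_normal((5, 5))
M = (B + B.T) / 2                                    # arbitrary symmetric test matrix
primal = opt_bls(M)
duals = [theta(rng.standard_normal(5), M) for _ in range(20)]
```

Every entry of `duals` should be at least `primal`, since each binary vector lies on the sphere $S_{n+1}$. Note also that `theta(u, M)` is unchanged when a constant is added to all the components of `u`.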

2.2 Properties of the dual relaxation 

2.2.1 Convexity 

It is important to notice first that the dual function $\theta(u)$ is convex, since it is the maximum over a family,
parametrized by $x \in S_{n+1}$, of affine functions in the variable $u$.

2.2.2 Weak duality 

The main classical property of the Lagrangian dual is weak duality, i.e.

\[ \min_{u \in \mathbb{R}^{n+1}} \theta(u) \ge \mathrm{opt(BLS)}, \]

where opt denotes the optimal value.

This property explains in part why Lagrange duality is used: it provides a bound on the primal optimal
value. When equality holds in the weak duality property, we say that strong duality holds. Sometimes, like in
the case of the Max-Cut problem, the bound can be proved to be proportional to the optimal original value.
More precisely, Goemans and Williamson proved that the optimum value of the eigenvalue relaxation (in fact
the equivalent SDP formulation; see the original paper and Section 3 below) is greater than or equal to the
optimal original value (this is just weak duality), which itself is always greater than or equal to approximately
0.878 times the eigenvalue relaxation's optimal value. A quite similar but less tight bound, proved by Nesterov,
applies directly to the present problem. We will recall this bound in Section 4.2.1 below.



2.2.3 Existence of dual solutions 

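It is well known that there exists an optimal dual solution. This was proved by Poljak and Wolkowicz in [20].
The proof given here is more direct.

Proposition 2.2.1 The dual function admits a minimizer.

Proof. Let $\theta^* = \inf_{u \in \mathbb{R}^{n+1}} \theta(u)$. Make the change of variable $v = u - \frac{1}{n+1}\big(\sum_{i=1}^{n+1} u_i\big) e$, i.e. define

\[ \eta(v) = (n+1)\, \lambda_{\max}(D(v) - M) = \theta(u). \]

We now have the property that $\sum_{i=1}^{n+1} v_i = 0$. We prove that $\eta$ is coercive on this hyperplane. Take any
sequence $(v^k)_{k \in \mathbb{N}}$ with $\sum_{i=1}^{n+1} v_i^k = 0$ and $\|v^k\| \to +\infty$ as $k \to +\infty$. We can assume, after extracting a
subsequence, that $v_i^k \to +\infty$ for some $i$, because otherwise the fact that $\|v^k\| \to +\infty$ implies that there must
exist a sequence $(v_j^k)_{k \in \mathbb{N}}$ with $v_j^k \to -\infty$, and the fact that $\sum_{i=1}^{n+1} v_i^k = 0$ gives a contradiction. Now, the
Gershgorin circle around the diagonal element $-M_{ii} + v_i^k$ has a constant radius, say $r$, and its center goes to
$+\infty$. Since the largest eigenvalue of a symmetric matrix is at least as large as each of its diagonal entries, we have
$\lambda_{\max}(D(v^k) - M) \ge -M_{ii} + v_i^k$, and this implies that $\lambda_{\max}(D(v^k) - M) \to +\infty$.
Thus $\eta$ is coercive and since it is continuous, it admits a minimizer that we will denote by $v^*$. Now, for all
$u \in \mathbb{R}^{n+1}$ and $v = u - \frac{1}{n+1}\big(\sum_{i=1}^{n+1} u_i\big) e$, we have

\[ \theta^* \le \theta(v^*). \]

But, on the other hand, $\theta(v^*) = \eta(v^*) \le \eta(v) = \theta(v) = \theta(u)$. Therefore,

\[ \theta(v^*) \le \theta^* \]

and the proof is complete. □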



2.2.4 Subdifferential's description and exactness criterion

The subdifferential $\partial\theta(u)$ of the eigenvalue relaxation has been much studied. Recall that for any convex
function $f : \mathbb{R}^m \to \mathbb{R}$, the subdifferential $\partial f(u)$ is defined by

\[ \partial f(u) = \{g \in \mathbb{R}^m \mid f(u') \ge f(u) + g^t(u' - u) \text{ for all } u' \in \mathbb{R}^m\}. \]

The analysis of d9{u) is based on the following general theorem. 

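Theorem 2.2.2 [18] Let $\mathcal{A} : \mathbb{R}^m \to \mathbb{S}_n$ be an affine operator defined by $\mathcal{A}(u) = \mathcal{A}_0 u + B$ for some linear
operator $\mathcal{A}_0 : \mathbb{R}^m \to \mathbb{S}_n$ and some matrix $B \in \mathbb{S}_n$. Then, we have

\[ \partial(\lambda_{\max} \circ \mathcal{A})(u) = \mathcal{A}_0^*\, \partial\lambda_{\max}(\mathcal{A}(u)) \]

with

\[ \partial\lambda_{\max}(X) = E_{\max} \{Z \in \mathbb{S}_{r_{\max}} \mid Z \succeq 0 \text{ and } \mathrm{trace}(Z) = 1\} E_{\max}^t, \]

where $\mathcal{A}_0^*$ is the adjoint of $\mathcal{A}_0$, $r_{\max}$ denotes the multiplicity of $\lambda_{\max}(X)$ and $E_{\max}$ is a matrix whose
columns form any orthonormal basis of the eigenspace of $X$ associated to $\lambda_{\max}(X)$.

Now, if we set $\mathcal{A}(u) = D(u) - M - \frac{e^tu}{n+1} I$, we get $B = -M$, $\mathcal{A}_0 u = D(u) - \frac{e^tu}{n+1} I$ and
$\mathcal{A}_0^* X = d(X) - \frac{1}{n+1}\mathrm{trace}(X)\, e$. For $d \in \mathbb{N}$, let $\mathcal{Z}_d$ be defined by

\[ \mathcal{Z}_d = \{Z \in \mathbb{S}_d \mid Z \succeq 0 \text{ and } \mathrm{trace}(Z) = 1\}. \]

Using the previous theorem, we obtain

Corollary 2.2.3 The subdifferential $\partial\theta(u)$ of the dual function $\theta$ is given by

\[ \partial\theta(u) = \{(n+1)\, d(E_{\max} Z E_{\max}^t) - \mathrm{trace}(E_{\max} Z E_{\max}^t)\, e \mid Z \in \mathcal{Z}_{r_{\max}}\}. \]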

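Following Oustry [5], the formula for $\partial\lambda_{\max}(X)$ in Theorem 2.2.2 is proved by showing that the maximum
eigenvalue function $\lambda_{\max}(X)$ on $\mathbb{S}_n$ is nothing but the support function $\sigma_{\mathcal{Z}_n}(X)$ of $\mathcal{Z}_n$, defined by

\[ \sigma_{\mathcal{Z}_n}(X) = \sup_{Z \in \mathcal{Z}_n} \langle X, Z \rangle \]

with the scalar product defined by $\langle X, Z \rangle = \mathrm{trace}(XZ)$. By definition, the face $F_{\mathcal{Z}_n}(X)$ of $\mathcal{Z}_n$ exposed by $X$
is the set of maximizers in this supremum, i.e.

\[ F_{\mathcal{Z}_n}(X) = \{Z \in \mathcal{Z}_n \mid \lambda_{\max}(X) = \langle X, Z \rangle\}. \]

Knowing that the subdifferential of a support function of a set is exactly the exposed face of this set, we finally
get

\[ \partial\lambda_{\max}(X) = \{Z \in \mathcal{Z}_n \mid \lambda_{\max}(X) = \langle X, Z \rangle\}, \]

and the formula follows after some linear algebra.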

There is a different path to the subdifferential's formula, which is perhaps more a propos in the context of
duality: it is proved in [18, Chapter XII] that

\[ \partial\theta(u) = \overline{\mathrm{conv}}\,\big\{(x_1^2 - 1, \dots, x_{n+1}^2 - 1)^t \mid x \in S_{n+1},\ L(x,u) = \theta(u)\big\}, \tag{2.2.1} \]

where $\overline{\mathrm{conv}}$ denotes the closure of the convex hull. This is in fact true for general continuous constrained
problems in the case where the underlying space is compact (for example)^1, and the associated technical
condition is called the filling property. The following proposition provides a useful sufficient condition for
proving that the relaxation is exact, i.e. that strong duality applies.

Proposition 2.2.4 Let $u^*$ be a minimizer of the dual eigenvalue relaxation. If $\lambda_{\max}(\mathcal{A}(u^*))$ has
multiplicity one, then

\[ \theta(u^*) = \mathrm{opt(BLS)} \]

and any eigenvector $x$ of $\mathcal{A}(u^*)$ associated to $\lambda_{\max}(\mathcal{A}(u^*))$ whose squared norm is $n+1$ is a binary solution of (BLS).

The proof is a direct consequence of [18, Theorem XII.2.3.4]. We provide a specialized proof here because it
is short and instructive.

Proof. Since the multiplicity of $\lambda_{\max}(\mathcal{A}(u^*))$ is one, the subdifferential of $\lambda_{\max} \circ \mathcal{A}$ at $u^*$ is a single vector.
Thus, $\theta$ is differentiable at $u^*$ and its gradient is simply

\[ \nabla\theta(u^*) = \big((x_1^*)^2 - 1, \dots, (x_{n+1}^*)^2 - 1\big)^t \]

for any $x^*$ in $S_{n+1}$ such that $\theta(u^*) = L(x^*, u^*)$. Since $u^*$ minimizes $\theta$, we must have $\nabla\theta(u^*) = 0$. This implies
that $(x_i^*)^2 = 1$ for all $i = 1, \dots, n+1$. Thus, using weak duality,

\[ \mathrm{opt(BLS)} \le \theta(u^*) = (x^*)^t(-M)x^* \le \mathrm{opt(BLS)}, \]

which proves that $x^*$ solves the original problem (BLS). □
We now have a nice criterion for deciding whether our relaxation was exact and if so, we also know how to 
recover a binary solution from an optimal eigenvector. This approach works for any quadratic binary problem 
and is extensively used for approximating combinatorial problems. However, the question remains on what to 
do when the relaxation is not exact, i.e. when the multiplicity at the optimum is greater than one. The next 
two sections will help answer this crucial question. 



3 From eigenvectors to SDP solutions 

The purpose of the next two sections is to describe how to recover primal binary solutions from the eigenvector
solutions of the dual eigenvalue problem. It was first shown by Goemans and Williamson [13], in the case of the
Max-Cut problem in graph theory, that good binary solutions can be generated at random using the SDP solution.
Their results were then extended by Nesterov to the case of indefinite quadratic binary programming
[11]. Those results show that both the eigenvalue and SDP relaxations are, in a certain precise sense,
very efficient. However, the two relaxations are not equivalent from the computational point of view. Recall that
one of the main motivations for using the eigenvalue relaxation is its manageable practical complexity, which
is often favorable compared to that of solving the SDP relaxation. What is not clear is how to generate
(primal) binary solutions that are good on average with the eigenvalue relaxation only. The first natural approach to
this question is of course to try and recover an optimal SDP solution from the eigenvalue relaxation. Thus,
we devote this section to this problem. It can be solved as follows: an appropriate convex combination of
rank one matrices obtained from a set of optimal eigenvectors is shown to be a solution of the kind we are looking for.
Our approach simplifies the presentation of [21]. The adaptation of the randomized algorithm of Goemans and
Williamson and the associated bound established by Nesterov will be discussed in the next section.

3.1 The SDP relaxation 

In order to obtain the Semi-Definite Programming (SDP) relaxation of the homogenized problem (BLS),
we begin with the following equivalence relating our problem to a problem on symmetric matrices. We have^2

\[ \mathrm{opt(BLS)} = \max_{x \in \mathbb{R}^{n+1}} \mathrm{trace}(-Mxx^t) \quad \text{s.t.} \quad d(xx^t) = e. \]

This last problem is itself equivalent to

\[ \max \; \mathrm{trace}(-MX) \quad \text{s.t.} \quad d(X) = e, \; X \succeq 0, \; \mathrm{rank}\, X = 1. \]

^1 which is the case here since we optimize over the sphere $S_{n+1}$.
^2 Here, we use the fact that $x^tMx = \mathrm{trace}(x^tMx) = \mathrm{trace}(Mxx^t)$.

This problem being nonconvex, we drop the rank constraint and obtain the following SDP (convex) relaxation

\[ \max \; \mathrm{trace}(-MX) \quad \text{s.t.} \quad d(X) = e, \; X \succeq 0, \tag{SDP} \]

whose value is obviously greater than or equal to opt(BLS).

An important result of Pataki [36, Theorem 2.1] gives a bound on the rank of solutions to Semi-Definite
Programs. In the case of our Semi-Definite relaxation, which has $n+1$ equality constraints, this theorem implies
that the rank $r^*$ of an optimal matrix $X^*$ can be taken to satisfy $\frac{1}{2} r^*(r^*+1) \le n+1$.

3.2 SDP versus maximal eigenvalue : theoretical equivalence 

It follows from the subdifferential's formula given in Corollary 2.2.3 that at any minimizer $u^*$, we have

\[ 0 \in \partial\theta(u^*) = \{(n+1)\, d(E^*_{\max} Z (E^*_{\max})^t) - \mathrm{trace}(E^*_{\max} Z (E^*_{\max})^t)\, e \mid Z \in \mathcal{Z}_{r^*_{\max}}\}. \]

Suppose we have in hand a matrix $Z^* \in \mathcal{Z}_{r^*_{\max}}$ such that

\[ 0 = (n+1)\, d(E^*_{\max} Z^* (E^*_{\max})^t) - \mathrm{trace}(E^*_{\max} Z^* (E^*_{\max})^t)\, e. \tag{3.2.1} \]

It appears that a good guess for a candidate solution $X^*$ to the SDP relaxation in the general case is

\[ X^* = (n+1)\, E^*_{\max} Z^* (E^*_{\max})^t. \]

We just need to check the details to see how it works. This result was initially proved in [21] but the proof
given here is more direct.

Theorem 3.2.1 [21] Let $u^*$ be an optimal solution of the eigenvalue relaxation, let $E^*_{\max}$ be a matrix whose
columns form an orthonormal basis of the eigenspace associated to $\lambda_{\max}(\mathcal{A}(u^*))$ and let $Z^*$ be as in (3.2.1). Then
the matrix $X^* = (n+1)\, E^*_{\max} Z^* (E^*_{\max})^t$ is an optimal solution of the SDP relaxation.

Remark 3.2.2 We would like to underline at this point that a more elegant proof of the theorem could be
obtained using conic duality, but we preferred to keep to elementary arguments since this is possible in the
present context.

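Proof. Compute the eigenvalue/eigenvector decomposition $Z^* = U\Delta U^t$, set $F = E^*_{\max} U$, $\delta = d(\Delta)$, let $r$ be
the multiplicity of $\lambda_{\max}(\mathcal{A}(u^*))$ and let $f_1, \dots, f_r$ denote the columns of $F$. Recall that from the definition of $Z^*$, we
have $\sum_{j=1}^r \delta_j = 1$. Then, we get from (3.2.1)

\[ 0 = d(F\Delta F^t) - \frac{1}{n+1}\mathrm{trace}(F\Delta F^t)\, e. \]

In particular, since $\mathrm{trace}(F\Delta F^t) = 1$, the matrix $X^* = (n+1) F\Delta F^t$ is positive semidefinite with $d(X^*) = e$,
i.e. it is feasible for (SDP). Moreover,

\[ \mathrm{trace}\Big(\big(D(u^*) - \tfrac{1}{n+1}(u^*)^te\, I\big) F\Delta F^t\Big) = (u^*)^t d(F\Delta F^t) - (u^*)^t \tfrac{1}{n+1}\mathrm{trace}(F\Delta F^t)\, e = 0. \]

Using this fact, we obtain

\begin{align*}
\mathrm{trace}(-MX^*) &= (n+1)\, \mathrm{trace}\Big(\big(-M + D(u^*) - \tfrac{1}{n+1}(u^*)^te\, I\big) F\Delta F^t\Big) \\
&= (n+1)\, \mathrm{trace}(\mathcal{A}(u^*) F\Delta F^t) \\
&= (n+1)\, \mathrm{trace}\Big(\mathcal{A}(u^*) \sum_{j=1}^r \delta_j f_j f_j^t\Big) \\
&= (n+1) \sum_{j=1}^r \delta_j f_j^t \mathcal{A}(u^*) f_j \\
&= (n+1) \sum_{j=1}^r \delta_j \lambda_{\max}(\mathcal{A}(u^*)) \\
&= (n+1)\, \lambda_{\max}(\mathcal{A}(u^*)),
\end{align*}

since $\sum_{j=1}^r \delta_j = 1$. Thus, the optimal value of the SDP is greater than or equal to the optimal value of the
eigenvalue relaxation. On the other hand, it is well known that the optimal value of the eigenvalue relaxation
is greater than or equal to the one of the SDP relaxation. We provide a proof here for the sake of completeness.
Let $X^{**}$ be an optimal solution to the SDP relaxation. Now, for all $u$ in $\mathbb{R}^{n+1}$, we have

\[ \mathrm{trace}\Big(X^{**}\big(D(u) - \tfrac{e^tu}{n+1} I\big)\Big) = 0 \]

by using the fact that $d(X^{**}) = e$. Now, compute the eigenvalue/eigenvector decomposition
$-M + D(u) - \frac{e^tu}{n+1} I = \sum_{i=1}^{n+1} \lambda_i v_i v_i^t$ and let $\lambda_{\max}$ be the greatest of these eigenvalues. Then,

\begin{align*}
\mathrm{trace}(-MX^{**}) &= \mathrm{trace}\Big(X^{**}\big(-M + D(u) - \tfrac{e^tu}{n+1} I\big)\Big) \\
&= \sum_{i=1}^{n+1} \lambda_i\, v_i^t X^{**} v_i \\
&\le \lambda_{\max}\, \mathrm{trace}\Big(X^{**} \sum_{i=1}^{n+1} v_i v_i^t\Big) \\
&= \lambda_{\max}\, \mathrm{trace}(X^{**} I) \\
&= (n+1)\, \lambda_{\max}.
\end{align*}

Since this is true for all $u$, we obtain that the eigenvalue relaxation majorizes the SDP relaxation. Thus, both
optimal values are equal and this completes the proof of the theorem. □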
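The second half of the proof (every feasible point of (SDP) is dominated by every value of the dual function) is easy to test numerically. The sketch below assumes NumPy; the symmetric matrix `M` is random test data, and `random_feasible_X` is a hypothetical helper that builds a feasible matrix as a Gram matrix of unit vectors. It is not a method for finding an optimal $X^*$.

```python
import numpy as np

def theta(u, M):
    """Dual function (2.1.1)."""
    N = M.shape[0]
    shifted = np.diag(u) - M - (u.sum() / N) * np.eye(N)
    return N * np.linalg.eigvalsh(shifted)[-1]

def random_feasible_X(N, rank, rng):
    """Gram matrix of `N` random unit vectors of dimension `rank`: PSD with unit diagonal."""
    V = rng.standard_normal((rank, N))
    V /= np.linalg.norm(V, axis=0)       # normalize columns so that diag(V^t V) = e
    return V.T @ V

rng = np.random.default_rng(2)
B = rng.standard_normal((6, 6))
M = (B + B.T) / 2                        # arbitrary symmetric test matrix
X = random_feasible_X(6, 3, rng)
sdp_value = np.trace(-M @ X)             # objective of (SDP) at the feasible point X
dual_values = [theta(rng.standard_normal(6), M) for _ in range(20)]
```

For any such `X` and any `u`, one expects `sdp_value <= theta(u, M)`, which is exactly the inequality chain displayed at the end of the proof.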



3.3 SDP versus maximal eigenvalue: practical implementation 

Of course, it can be hard to find a matrix $Z^* \in \mathcal{Z}_{r^*_{\max}}$ that works. We will now try to overcome this problem.
We first have to specify how the subgradients are obtained in practice. At each point $u \in \mathbb{R}^{n+1}$, choose an
eigenvector $x$ of squared norm equal to $n+1$ associated to $\lambda_{\max}(\mathcal{A}(u))$. Then, using the alternative
representation of the subdifferential (2.2.1), a subgradient of $\theta$ at $u$ is obtained by setting $g = [x_1^2 - 1, \dots, x_{n+1}^2 - 1]^t$.

Assume that we have a set of subgradients $g_j = [(x_1^j)^2 - 1, \dots, (x_{n+1}^j)^2 - 1]^t \in \partial\theta(u^j)$ for some $u^j$, $j = 1, \dots, p$,
and such that

\[ \Big\| 0 - \sum_{j=1}^p \alpha_j g_j \Big\| \le \epsilon, \tag{$\epsilon$OPT} \]

for some nonnegative $\alpha_j$'s with $\sum_{j=1}^p \alpha_j = 1$. This can be performed for $\epsilon$ as small as we want by using a
bundle method. Such a method will construct in a finite number of iterations, say $k$, an iterate $u^k$ and a family
of $u^j$'s with the desired property, all of them lying in a small neighborhood of $u^k$. This is one very nice feature
of the bundle mechanism, which is extensively described in [18, Volume II]. Moreover, it is a well known fact,
called Caratheodory's theorem, that only $p = n+2$ subgradients are sufficient in the expression ($\epsilon$OPT).
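The subgradient computation described above can be sketched in a few lines. This is only an illustration, assuming NumPy; `theta_and_subgradient` is a hypothetical helper name, and no bundle method is implemented here.

```python
import numpy as np

def theta_and_subgradient(u, M):
    """Return theta(u), one subgradient g of theta at u, and the eigenvector x used.

    x is an eigenvector for lambda_max(A(u)) scaled so that x^t x = n + 1,
    and g = [x_1^2 - 1, ..., x_{n+1}^2 - 1]^t as in representation (2.2.1).
    """
    N = M.shape[0]                                    # N = n + 1
    shifted = np.diag(u) - M - (u.sum() / N) * np.eye(N)
    eigvals, eigvecs = np.linalg.eigh(shifted)        # ascending eigenvalues
    x = np.sqrt(N) * eigvecs[:, -1]                   # top eigenvector, squared norm N
    g = x**2 - 1.0
    return N * eigvals[-1], g, x
```

A quick sanity check is the subgradient inequality $\theta(u') \ge \theta(u) + g^t(u' - u)$ at a few random points $u'$; the components of `g` also sum to zero because $x^tx = n+1$.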
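Set

\[ X_\epsilon = \sum_{j=1}^p \alpha_j\, x^j (x^j)^t. \]

Then, we have the following result.

Proposition 3.3.1 For any $\epsilon > 0$, the matrix $X_\epsilon$ defined above satisfies

\[ \mathrm{trace}(-MX_\epsilon) \le \min_{u \in \mathbb{R}^{n+1}} \theta(u) + O(\epsilon). \]

Proof. Let $u^*$ be any minimizer of $\theta$. Then, for each $j = 1, \dots, p$, we have by the definition of the subdifferential

\[ \theta(u^*) \ge \theta(u^j) + g_j^t(u^* - u^j). \]

But $\theta(u^j)$ is given by

\[ \theta(u^j) = (x^j)^t\Big(D(u^j) - M - \frac{e^tu^j}{n+1} I\Big)x^j. \]

On the other hand, since $(x^j)^t x^j = n+1$,

\begin{align*}
(x^j)^t\Big(D(u^j) - M - \frac{e^tu^j}{n+1} I\Big)x^j
&= -(x^j)^t M x^j + \sum_{i=1}^{n+1} u_i^j (x_i^j)^2 - \sum_{i=1}^{n+1} u_i^j \\
&= -(x^j)^t M x^j + \sum_{i=1}^{n+1} u_i^j \big((x_i^j)^2 - 1\big) \\
&= -(x^j)^t M x^j + g_j^t u^j.
\end{align*}

Thus, we obtain

\[ \theta(u^*) \ge \mathrm{trace}(-M x^j (x^j)^t) + g_j^t u^*, \]

which implies, after multiplying by $\alpha_j$ and summing over $j = 1, \dots, p$,

\[ \theta(u^*) \ge \mathrm{trace}(-MX_\epsilon) + \Big(\sum_{j=1}^p \alpha_j g_j\Big)^t u^*. \]

Using the Cauchy-Schwarz inequality and ($\epsilon$OPT), this gives

\[ \theta(u^*) \ge \mathrm{trace}(-MX_\epsilon) - \epsilon \|u^*\|. \]

Since the eigenvalue and the SDP relaxation have equal optimal values, we finally obtain

\[ \mathrm{opt(SDP)} \ge \mathrm{trace}(-MX_\epsilon) - \epsilon \|u^*\|, \]

which implies the desired result. □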

3.4 Comments 

It is a common idea that the SDP relaxation contains more information than the eigenvalue relaxation. We 
hope that the results of this section managed to convince the reader that this is in fact not the case and a good 
approximate solution can be recovered quite easily using subgradient information at the optimum. 

4 Recovering primal binary solutions 

We are now in a position to answer our main question of how to recover a satisfactory, although sometimes
suboptimal, primal binary solution. In the first part of this section, we show that optimal binary solutions
can actually be exactly recovered using the eigenvalue relaxation, i.e. strong duality holds, under some simple
conditions. Then, in the case where the problem does not satisfy these sufficient conditions for strong duality,
we develop a randomized algorithm based on the optimal eigenspace of the maximum eigenvalue dual function
and show that this procedure is equivalent to Goemans and Williamson's randomized algorithm for Max-Cut.
This provides a new interpretation of Goemans and Williamson's procedure.



4.1 A sufficient condition for strong duality

We have the following theorem. 

Theorem 4.1.1 For almost all $A$, in the sense of the Lebesgue measure, such that $A^tA + \nu P$ is componentwise
negative outside the diagonal, the eigenvalue relaxation is exact, i.e. strong duality holds.

Proof. Fix $u \in \mathbb{R}^{n+1}$. Let $u^n$ be the vector of the first $n$ components of $u$. The fact that $A^tA + \nu P$ is
componentwise negative outside the diagonal implies that $-A^tA - \nu P + D(u^n) - \min(u^n) I$ is componentwise
positive. Thus, the Perron-Frobenius theorem implies that the maximum eigenvalue of
$-A^tA - \nu P + D(u^n) - \min(u^n) I$ has multiplicity one. From this, we deduce that the maximum eigenvalue of
$-A^tA - \nu P + D(u^n)$ also has multiplicity one. Let $V_{u^n} D_{u^n} V_{u^n}^t$ be an eigenvalue decomposition of
$-A^tA - \nu P + D(u^n)$, where we used the subscript $u^n$ in order to remember that, whatever the chosen
decomposition, it is a nonlinear and not necessarily continuous function of $u$. Moreover, since the maximum
eigenvalue has multiplicity one, Corollary 4 in [22] says that it is possible to choose the eigenvector associated to
the maximum eigenvalue as a continuously differentiable function of $u^n$. We will denote by $v^{\max}_{u^n}$ this eigenvector.
Using this parametrization, the matrix

\[ -M + D(u) = -\begin{bmatrix} A^tA + \nu P & -A^ty \\ -y^tA & y^ty \end{bmatrix} + D(u) \]

can be rewritten as

\[ -M + D(u) = \begin{bmatrix} V_{u^n} & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} D_{u^n} & V_{u^n}^t A^t y \\ y^t A V_{u^n} & u_{n+1} - y^ty \end{bmatrix} \begin{bmatrix} V_{u^n}^t & 0 \\ 0 & 1 \end{bmatrix}, \]

where all dimensions can easily be guessed from the previous knowledge on the involved submatrices.
Let $\mathcal{V}$ be the codimension one differentiable submanifold defined by

\[ \mathcal{V} = \{(A,u) \in \mathbb{R}^{m \times n} \times \mathbb{R}^{n+1} \mid y^t A v^{\max}_{u^n} = 0\}. \]

Let $\mathcal{W}$ be the optimal set defined by

\[ \mathcal{W} = \{(A,u) \in \mathbb{R}^{m \times n} \times \mathbb{R}^{n+1} \mid 0 \in \partial\theta(u)\}. \]

Due to the representation

\begin{align*}
\partial\theta(u) = \{ & (n+1)\, d(V_{\max} Z V_{\max}^t) - \mathrm{trace}(V_{\max} Z V_{\max}^t)\, e \mid V_{\max} \in \mathbb{R}^{(n+1) \times r_{\max}},\ Z \in \mathbb{S}_{r_{\max}},\ Z \succeq 0, \\
& (-M + D(u)) V_{\max} = \lambda_{\max} V_{\max},\ V_{\max}^t V_{\max} = I,\ \mathrm{trace}(Z) = 1 \},
\end{align*}

the set $\mathcal{W}$ is the projection onto the cartesian product $\mathbb{R}^{m \times n} \times \mathbb{R}^{n+1}$ of the set $\cup_{r=1}^R \mathcal{W}_r$, where $R$ is
the upper bound of Pataki (see Section 3.1) on the optimal rank of the SDP relaxation^3 (here $R \le \sqrt{2n}$ for $n$
large) and where $\mathcal{W}_r$ is the set

\begin{align*}
\mathcal{W}_r = \{ & (A,u,V,\lambda,Z) \mid A \in \mathbb{R}^{m \times n},\ u \in \mathbb{R}^{n+1},\ V \in \mathbb{R}^{(n+1) \times r},\ Z \in \mathbb{S}_r,\ (-M + D(u))V = \lambda V, \\
& V^tV = I,\ \mathrm{trace}(Z) = 1,\ (n+1)\, d(VZV^t) - \mathrm{trace}(VZV^t)\, e = 0 \},
\end{align*}

whose intersection with $\{(A,u,V,\lambda,Z) \mid A \in \mathbb{R}^{m \times n},\ u \in \mathbb{R}^{n+1},\ V \in \mathbb{R}^{(n+1) \times r},\ Z \in \mathbb{S}_r,\ Z \succeq 0\}$ corresponds to the
parameter set allowing for zero to belong to the subdifferential of the dual function $\theta$ in the case where
$\lambda = \lambda_{\max}(-M + D(u))$. Now, since the constraint $(-M + D(u))V = \lambda V$ is described by $(n+1)r$ equations, $V^tV = I$
by $\frac{r(r+1)}{2}$ equations, $\mathrm{trace}(Z) = 1$ by one equation and $(n+1)\, d(VZV^t) - \mathrm{trace}(VZV^t)\, e = 0$ by $n+1$ equations,
the dimension of $\mathcal{W}_r$ is greater than or equal to

\[ mn + (n+1) + (n+1)r + 1 + \frac{r(r+1)}{2} - (n+1)r - \frac{r(r+1)}{2} - 1 - (n+1) = mn. \]

Furthermore, notice that since the eigenvalues are continuous functions of the entries of $-M + D(u)$, the subset
of $\cup_{r=1}^R \mathcal{W}_r$ for which $\lambda = \lambda_{\max}(-M + D(u))$ is open in the topology induced by the ambient space. Therefore
its projection onto the cartesian product $\mathbb{R}^{m \times n} \times \mathbb{R}^{n+1}$ is of dimension at least $mn$, which
guarantees that the projection onto the $A$-space $\mathbb{R}^{m \times n}$ of its intersection with $\mathcal{V}$ is a set of null Lebesgue
measure. And thus, for almost all $A$ such that $A^tA + \nu P$ is componentwise negative outside the diagonal,
$y^t A v^{\max}_{u^n} \ne 0$.

Using this result, Theorem A about the interlacing property of the eigenvalues for arrow matrices in the
Appendix implies that the maximum eigenvalue of $-M + D(u)$ is greater than the maximum diagonal element
of $D_{u^n}$, which is nothing but $\lambda_{\max}(-(A^tA + \nu P) + D(u^n))$, and all $n$ other eigenvalues are less than
$\lambda_{\max}(-(A^tA + \nu P) + D(u^n))$. This implies that for almost all $A$, the maximum eigenvalue of $-M + D(u)$ has
multiplicity one at the optimum, which implies that $\theta$ is differentiable at the optimum. Therefore, using
Proposition 2.2.4, we obtain that strong duality holds for almost all $A$ such that $A^tA + \nu P$ is componentwise
negative outside the diagonal. □

4.2 When strong duality fails: the randomized algorithm 

We start this section by recalling Goemans and Williamson's algorithm and Nesterov's bound.

4.2.1 Goemans and Williamson's algorithm and Nesterov's bound

The method relies on the Cholesky factorization of the optimal solution $X^*$ of the SDP relaxation,

\[ X^* = V^tV. \]

From Theorem 3.2.1 we see that $V \in \mathbb{R}^{r_{\max} \times (n+1)}$, where $r_{\max}$ is the multiplicity of $\lambda_{\max}(\mathcal{A}(u^*))$ at the chosen
corresponding solution $u^*$ of the eigenvalue relaxation. This factorization is important, since it allows us to write
$X^*_{ij} = v_i^t v_j$ where $v_i$ is the $i$-th column of $V$. Let $\xi$ be a random variable with uniform distribution
on the unit sphere in $\mathbb{R}^{r_{\max}}$.

Procedure 4.2.1 (Goemans and Williamson's algorithm)

1. Find the Cholesky factorization $X^* = V^tV$.

Let $\xi$ be a random vector with uniform distribution on the unit sphere $S(0,1)$ of $\mathbb{R}^{r_{\max}}$. The random cut is defined
by

\[ Z = \mathrm{sign}(V^t\xi), \]

where the sign function is defined coordinate-wise.

2. Draw $n$ samples from $Z$, say $z^1, \dots, z^n$, and choose the sample giving the best value of the objective
function $z^tMz$.

The key result is that, on average, the vector $Z$ gives a good binary solution to the original problem. Since the
best sample will have greater cut value than the average with overwhelming probability, the above procedure
should work well. This is made precise by Nesterov's theorem.
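The rounding step of Procedure 4.2.1 can be sketched as follows. This is a hedged illustration, assuming NumPy, and it departs from the text in two labeled ways: the factor $V$ is obtained from an eigendecomposition rather than a Cholesky factorization (which is more robust when $X$ is singular), and the samples are ranked by the (BLS) objective $-z^tMz$. The function name `gw_round` and its parameters are hypothetical.

```python
import numpy as np

def gw_round(X, M, num_samples, rng):
    """Round a feasible SDP point X (PSD, unit diagonal) to a vector in {-1, 1}^N.

    Returns the best sample and its value for the objective -z^t M z of (BLS).
    """
    # Factor X = V^t V via an eigendecomposition (assumption: used in place of Cholesky).
    w, Q = np.linalg.eigh(X)
    w = np.clip(w, 0.0, None)                # remove tiny negative eigenvalues
    V = (Q * np.sqrt(w)).T                   # rows sqrt(w_k) q_k^t, so V^t V = X
    best_z, best_val = None, -np.inf
    for _ in range(num_samples):
        xi = rng.standard_normal(V.shape[0])
        xi /= np.linalg.norm(xi)             # uniform direction on the unit sphere
        z = np.where(V.T @ xi >= 0, 1.0, -1.0)   # coordinate-wise sign
        val = -z @ M @ z
        if val > best_val:
            best_z, best_val = z, val
    return best_z, best_val
```

When `X` happens to be a rank one matrix $x_0x_0^t$ with $x_0$ binary, every sample is $\pm x_0$, so the procedure returns the value $-x_0^tMx_0$; this gives a simple way to test the sketch.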

^3 which also holds for the eigenvalue relaxation due to the complete equivalence between these two problems.

Theorem 4.2.2 (Nesterov) Define

\[ f^* = \max \; x^tMx \quad \text{s.t.} \quad x \in \{-1,1\}^{n+1} \]

and

\[ f_* = \min \; x^tMx \quad \text{s.t.} \quad x \in \{-1,1\}^{n+1}; \]

then, we have

f*-E[z*Mz] 2 

r-f* - ^' 

This result is remarkable despite the fact that the bound ^ is rather large. An important issue for future 
research is to study such type of bounds for particular subclasses of problems in hope of improving Nesterov's 
result. 



4.2.2 The eigenvector viewpoint 

The main drawback of the former presentation is that using the uniform variable $\xi$ is quite hard to motivate
from an optimization viewpoint. Let us take a slightly different perspective. Assume that we have a solution
$u^*$ of the eigenvalue relaxation. As before, let $E_{\max}$ be a matrix whose columns form an orthonormal basis of
the eigenspace associated to $\lambda_{\max}(\mathcal{A}(u^*))$. Moreover, we may require that

\[ 0 = \mathcal{A}_0^*(E_{\max} \Lambda E_{\max}^t), \tag{4.2.1} \]

where $\mathcal{A}_0^* X = d(X) - \frac{1}{n+1}\mathrm{trace}(X)\, e$ is the adjoint operator of Section 2.2.4 and $\Lambda$ is some diagonal matrix
with $\alpha = d(\Lambda)$, $\alpha \ge 0$ and $\sum_{j=1}^{r_{\max}} \alpha_j = 1$. In the case where the multiplicity
at the optimum is one, the optimal eigenbasis reduces to a unique vector and we saw in Proposition 2.2.4 that
multiplying this vector by $\sqrt{n+1}$ gives a binary solution. Now let us turn to the case where there are $r_{\max} > 1$
eigenvectors. To each unit norm eigenvector $e^j$, we associate a subgradient
$g_j = [(n+1)(e_1^j)^2 - 1, \dots, (n+1)(e_{n+1}^j)^2 - 1]^t$. Then, (4.2.1) implies that

\[ 0 = \sum_{j=1}^{r_{\max}} \alpha_j g_j. \tag{4.2.2} \]



Figure 1: Three subgradients at the optimal dual solution, one convex combination of which gives zero.

Now one natural strategy might be the following: pick the best eigenvector, i.e. the eigenvector $\sqrt{n+1}\, e^{j_0}$
whose associated coefficient $\alpha_{j_0}$ in expression (4.2.2) is the greatest, and round its coordinates to the nearest
binary values. There is a second strategy: draw random linear combinations of the $\sqrt{n+1}\, e^j$'s, giving preference
to the components with higher associated coefficient in (4.2.2). This can be done by sampling vectors of the
type

\[ \sum_{j=1}^{r_{\max}} \zeta_j \sqrt{n+1}\, e^j, \]

where the $\zeta_j$'s are independent random variables with distribution $\mathcal{N}(0, \alpha_j)$. For each sample, a feasible solution
is obtained by rounding off the components to the nearest binary value. We sum up this procedure as follows.

Procedure 4.2.3 (Randomized algorithm based on optimal eigenvectors)

1. Find the matrix $E_{\max}$ whose columns form an orthonormal eigenbasis associated to $\lambda_{\max}(\mathcal{A}(u^*))$ such that (4.2.2) holds for some $\alpha_j$'s
satisfying $\alpha \geq 0$ and $\sum_{j=1}^{r_{\max}} \alpha_j = 1$.

2. Let $\zeta$ be a random vector with distribution $\mathcal{N}(0, D(\alpha))$. The random cut is defined by

$Z = \mathrm{sign}(\sqrt{n+1}\, E_{\max} \zeta)$.

3. Draw $n$ samples from $Z$, say $Z^1, \ldots, Z^n$, and choose the sample giving the best value of the objective
function $z^t M z$.
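
The steps of Procedure 4.2.3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the eigenvectors and the weights $\alpha_j$ are taken as given inputs (computing them requires solving the eigenvalue relaxation first), "best" is read here as "smallest value of $z^t M z$", and the function names `quad_form`, `random_cut` and `best_cut` are hypothetical.

```python
import math
import random

def quad_form(M, z):
    """Return z^t M z for a square matrix M given as a list of rows."""
    n = len(z)
    return sum(z[i] * M[i][j] * z[j] for i in range(n) for j in range(n))

def random_cut(E, alpha, rng):
    """Draw one cut Z = sign(sqrt(n+1) * E_max * zeta), with zeta_j ~ N(0, alpha_j).

    E is a list of r eigenvectors (the columns of E_max), each of length n+1;
    alpha is the list of the r nonnegative weights.
    """
    dim = len(E[0])  # dim = n + 1
    # random.gauss takes a standard deviation, hence the square root of the variance
    zeta = [rng.gauss(0.0, math.sqrt(a)) for a in alpha]
    v = [math.sqrt(dim) * sum(zeta[j] * E[j][i] for j in range(len(E)))
         for i in range(dim)]
    # Assumption: a coordinate exactly equal to zero is rounded to +1
    return [1 if c >= 0 else -1 for c in v]

def best_cut(M, E, alpha, n_samples, seed=0):
    """Draw n_samples cuts and keep the one with the smallest z^t M z (assumed criterion)."""
    rng = random.Random(seed)
    cuts = [random_cut(E, alpha, rng) for _ in range(n_samples)]
    return min(cuts, key=lambda z: quad_form(M, z))
```

With a single eigenvector and $\alpha = (1)$, every sample is that eigenvector rounded to its signs, up to a global sign flip, which matches the multiplicity-one case discussed above.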



The important result is that this second strategy is equivalent to Goemans and Williamson's randomized 
procedure. 



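Proposition 4.2.4 Procedure 4.2.3 is equivalent to Goemans and Williamson's algorithm.

Proof. Set $W = E_{\max} D(\alpha)^{1/2}$. Then the earlier theorem relating the eigenvalue and SDP relaxations, together with the optimality condition above, implies that $X^* = V^t V$ with $V^t = W$,
thus retrieving the Cholesky factorization of $X^*$. Let $\xi = D(\alpha)^{-1/2}\zeta$. It is clear that $\xi$ has distribution $\mathcal{N}(0, I)$.
Since $E_{\max}\zeta = W\xi$ and the sign is unaffected by the positive factor $\sqrt{n+1}$, this proves that the cut $Z$ obtained by Procedure 4.2.3 is exactly the output of Goemans and Williamson's
procedure. □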
The eigenvalue point of view thus allowed us to provide an alternative and geometric explanation for taking 
a random cut using a uniformly distributed variable on the sphere in Goemans and Williamson's methodology. 



5 Two application examples 

In this section, we provide some results for the concrete problem of image denoising and show how this
relaxation applies to the problem of multiuser detection in CDMA systems.

5.1 Image denoising 

5.1.1 Presentation of the problem 

The first set of simulations is devoted to the denoising problem, in which $A$ is simply the identity matrix.
This is the problem considered in [23], [14] and [11] for instance. The original binary image has 26 rows and 62
columns, which gives a total of 1612 variables.

For this problem, the penalization matrix $P$ is chosen so as to smooth the image. This is achieved by
requiring neighboring pixels to be similar, in the sense that if $i$ and $j$ are indices of neighboring pixels, then we
would like the least squares cost to be penalized by the quantity $|x_i - x_j|^2$. Thus, $P$ is the matrix associated to
the quadratic form

$\sum_{i \sim j} \alpha_{ij} |x_i - x_j|^2$, (5.1.1)

where $i \sim j$ denotes the property of being neighbor indices and the $\alpha_{ij}$ are nonnegative. The neighborhood of
each pixel is usually chosen to be the north, south, east and west pixels.
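
To make the structure of $P$ concrete, here is a minimal Python sketch that builds the matrix of the quadratic form (5.1.1) for a small image with the four-neighbor system. It makes assumptions not fixed by the text: a single uniform weight for all pairs (the $\alpha_{ij}$ may differ in general), row-by-row pixel indexing, and each unordered pair of neighbors counted once. The names `smoothing_matrix` and `penalty` are hypothetical.

```python
def smoothing_matrix(rows, cols, weight=1.0):
    """Return P (list of rows) such that x^t P x = sum over neighbor pairs of weight * (x_i - x_j)^2.

    Pixels are indexed row by row; neighbors are the north, south, east and west pixels.
    """
    n = rows * cols
    P = [[0.0] * n for _ in range(n)]
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            # Looking only east and south visits every unordered neighbor pair once
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < rows and cc < cols:
                    j = rr * cols + cc
                    P[i][i] += weight
                    P[j][j] += weight
                    P[i][j] -= weight
                    P[j][i] -= weight
    return P

def penalty(P, x):
    """Evaluate the quadratic form x^t P x."""
    n = len(x)
    return sum(x[i] * P[i][j] * x[j] for i in range(n) for j in range(n))
```

By construction, every off-diagonal entry of such a matrix is nonpositive as soon as the weights are nonnegative, which is the sign pattern used in the exactness argument of the next subsection.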

5.1.2 Exactness of the relaxation 

The following theorem is the main result of this section. 

Theorem 5.1.1 For $A = I$, the identity matrix, and $P$ the matrix associated to the quadratic form (5.1.1), the
eigenvalue relaxation is exact.

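Proof. The eigenvalue relaxation of the optimization problem corresponding to this binary least squares
denoising problem is, as before,

$\min_{u \in \mathbb{R}^{n+1}} (n+1)\lambda_{\max}\Big(-\big(M + \nu P + \frac{e^t u}{n+1} I\big) + D(u)\Big)$. (Denoise)

Consider now the perturbed optimization problem

$\min_{u \in \mathbb{R}^{n+1}} (n+1)\lambda_{\max}\Big(-\big(M + \Delta M + \nu P + \frac{e^t u}{n+1} I\big) + D(u)\Big)$, (Perturbed)

where $\Delta M$ is negative outside the diagonal. Since the $\alpha_{ij}$ are nonnegative, the matrix $P$ has only nonpositive
off-diagonal terms and thus Theorem 4.1.1 proves that strong duality holds for this problem and there exists
a binary eigenvector that achieves optimality. Assume that $\Delta M$ is chosen so that $\|\Delta M\| \leq \epsilon$. Then, since the maximum eigenvalue changes by at most $\|\Delta M\|$ under the perturbation, the
optimum value $\theta^*$ of problem (Denoise) and the optimum value $\theta^*_{\Delta M}$ of problem (Perturbed) satisfy

$|\theta^* - \theta^*_{\Delta M}| \leq (n+1)\epsilon$.

Moreover, by weak duality, we have

$\max_{x \in \{-1,1\}^n} -x^t(I + \nu P)x \leq \theta^*$.

Since strong duality holds for problem (Perturbed), denoting by $x^*_{\Delta M}$ a solution of $\max_{x \in \{-1,1\}^n} -x^t(I + \Delta M + \nu P)x$,
we have

$\theta^*_{\Delta M} = -(x^*_{\Delta M})^t(I + \Delta M + \nu P)x^*_{\Delta M} \leq \max_{x \in \{-1,1\}^n} -x^t(I + \nu P)x$.

Therefore, we obtain

$-(x^*_{\Delta M})^t(I + \Delta M + \nu P)x^*_{\Delta M} \leq \max_{x \in \{-1,1\}^n} -x^t(I + \nu P)x \leq -(x^*_{\Delta M})^t(I + \Delta M + \nu P)x^*_{\Delta M} + (n+1)\epsilon$,

which implies

$-(x^*_{\Delta M})^t(I + \nu P)x^*_{\Delta M} - (n+1)\epsilon \leq \max_{x \in \{-1,1\}^n} -x^t(I + \nu P)x \leq -(x^*_{\Delta M})^t(I + \nu P)x^*_{\Delta M} + 2(n+1)\epsilon$.

Now, since $\{-1,1\}^n$ is finite, the image $\mathcal{I}$ of $\{-1,1\}^n$ by the function $x \mapsto -x^t(I + \nu P)x$ is a finite set. Let $\delta$ denote
the distance between $\max_{x \in \{-1,1\}^n} -x^t(I + \nu P)x$ and the closest other element of $\mathcal{I}$. Now, choosing $2(n+1)\epsilon < \delta$, we obtain

$-(x^*_{\Delta M})^t(I + \nu P)x^*_{\Delta M} = \max_{x \in \{-1,1\}^n} -x^t(I + \nu P)x$,

which proves that the denoising problem is polynomial time solvable by solving problem (Perturbed). □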
This theorem is to be compared with the results of D. M. Greig, B. T. Porteous and A. H. Seheult [34], who
formulate the binary denoising problem as a minimization problem with cost given at the top of page 273. The
objective to be minimized in [34] can be rearranged so as to minimize a linear cost with the same penalization as the
one given by (5.1.1). The main contribution of [34] is to show that this problem can be solved in polynomial time
using a network flow algorithm. Notice that our proof works for $A^tA = I$ and any additional linear term added
to the penalized objective function to be optimized. Since the eigenvalue relaxation can also be optimized in
polynomial time, this confirms that the eigenvalue relaxation performs at least as well as previous approaches
on a well-known problem. On the other hand, the eigenvalue relaxation can be a flexible approach in more
complicated cases where $A$ is not equal to the identity or other quadratic constraints have to be incorporated,
such as in [10].

5.1.3 A numerical experiment 

The experiments reported below were performed for the case of quite noisy original images. The noise
was taken to be additive, independent, identically distributed and Gaussian $\mathcal{N}(0, 2)$, and was applied to the
symmetrized image with pixel values in $\{-1, 1\}$. In order to show the influence of the smoothing parameter $\nu$,
we displayed the percentage of misspecified bits versus the values of $\nu$. The recovered image is the one with the choice
of $\nu$ giving the best percentage of bits recovered.
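
The noise model and the error measure of this experiment can be sketched as follows. This is a hedged illustration, not the authors' code: it assumes that $\mathcal{N}(0, 2)$ denotes a variance of 2, it replaces the original image by a random $\{-1, 1\}$ image of the same size, and it uses plain thresholding (no smoothing at all) as a baseline estimate. The function names are hypothetical.

```python
import math
import random

def add_gaussian_noise(image, variance, rng):
    """Add i.i.d. N(0, variance) noise to every pixel of a flat {-1, 1} image."""
    std = math.sqrt(variance)  # random.gauss expects a standard deviation
    return [p + rng.gauss(0.0, std) for p in image]

def misclassified_percentage(original, estimate):
    """Percentage of pixels on which the estimate differs from the original."""
    wrong = sum(1 for a, b in zip(original, estimate) if a != b)
    return 100.0 * wrong / len(original)

rng = random.Random(0)
# Stand-in for the 26 x 62 binary image of the experiment (1612 pixels)
original = [rng.choice((-1, 1)) for _ in range(26 * 62)]
noisy = add_gaussian_noise(original, 2.0, rng)
# Baseline without any smoothing: round each noisy pixel to the nearest binary value
thresholded = [1 if p >= 0 else -1 for p in noisy]
baseline_error = misclassified_percentage(original, thresholded)
```

Under these assumptions a pixel is misclassified by thresholding when the noise exceeds 1 in the wrong direction, which happens for roughly a quarter of the pixels; the curve of misspecified bits versus $\nu$ should be read against this kind of baseline.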

We found the results very encouraging. Indeed, even when the observed image is very noisy, we still recover
an image which is readable. This suggests that an appropriate postprocessing might easily allow one to recover
the original written words, by comparing the letters to a given dictionary. Cross validation can be used to
estimate $\nu$. We will not discuss this problem here. Instead, it seems reasonable to argue that the choice of $\nu$ can
just be made a posteriori, since it consists of tuning the method until a satisfactory solution is obtained. This
reduces the hard combinatorial initial problem to a simpler one-parameter tuning procedure. The displayed
experiment and the numerous simulations not presented here confirm that robust intervals for the values of $\nu$
are not very difficult to identify in practice.

5.2 Multiuser detection in CDMA systems 
5.2.1 Presentation of the problem 

This problem was studied in [24] using the maximum likelihood approach. As we will see, the resulting
optimization problem is of the same form as the binary least squares problem. The main difference here is that
$A \neq I$ and $P = 0$.

A synchronous $K$-user DS-CDMA system is considered with a common single-path additive white Gaussian
noise (AWGN) channel. The signature waveform of the $k$th user is denoted by $s_k(t)$, a function taking nonzero
values in $[0, T]$ and being equal to zero outside this interval, and $x_k$ is the information bit transmitted by user
$k$. The overall received signal is therefore of the form

$y(t) = \sum_{k=1}^{K} a_k x_k s_k(t) + n(t)$

where $a_k$ is the amplitude of the $k$th user's signal and $n(t)$ is an additive white Gaussian noise with zero
mean and variance $\sigma^2$. The signal $y$ is then filtered using a bank of $K$ matched filters. The output of the $k$th
matched filter is given by

$y_k = \int_0^T y(t) s_k(t)\,dt$.

In matrix form, this can be written

$y = RAx + v$

where $y = [y_1, \ldots, y_K]^t$, $R$ is the correlation matrix whose components are given by $R_{ij} = \int_0^T s_i(t) s_j(t)\,dt$,
$A = D(a)$ and $v$ is the vector with components $v_k = \int_0^T n(t) s_k(t)\,dt$.

Since the Gaussian vector $v$ has a correlation matrix equal to $\sigma^2 R$, the ML estimator is obtained by simply
solving the following combinatorial optimization problem:

$\min_{x \in \mathbb{R}^K} \; x^t ARAx - 2y^t Ax$ (5.2.1)

s.t. $x_i \in \{-1, 1\}$, $i = 1, \ldots, K$.
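
For very small $K$, problem (5.2.1) can be solved by exhaustive search, which gives a reference solution against which relaxations and heuristics can be compared. The sketch below is such a brute-force detector; it is an illustration only (the cost grows as $2^K$), and the names `ml_objective` and `ml_detect` are hypothetical.

```python
import itertools

def ml_objective(R, a, y, x):
    """Return x^t A R A x - 2 y^t A x, with A = diag(a)."""
    K = len(x)
    ax = [a[k] * x[k] for k in range(K)]
    quad = sum(ax[i] * R[i][j] * ax[j] for i in range(K) for j in range(K))
    lin = sum(y[k] * ax[k] for k in range(K))
    return quad - 2.0 * lin

def ml_detect(R, a, y):
    """Exhaustive search over {-1, 1}^K; only practical for tiny K."""
    K = len(a)
    return min(itertools.product((-1, 1), repeat=K),
               key=lambda x: ml_objective(R, a, y, x))
```

For instance, with two users, unit amplitudes, $R$ having off-diagonal entries $-0.5$ and a noiseless observation $y = RAx$ generated from $x = (1, -1)$, the search returns the transmitted bits.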

5.2.2 Some comments 

The SDP approach seems to have been first applied for the DS-CDMA detection problem in Since then 
numerous contributions have appeared using the SDR and comparing it to other methods, as in [28] and [29].
Extension to M-ary phase shift keying symbol constellations is proposed in [30]. The issue of accelerating
the speed of the method is addressed in [31]. However, as for the former problem, the main drawback of the
standard primal semidefinite relaxation is that the size of the problem is greatly increased by using $K \times K$
matrices instead of vectors of size $K$. In order to overcome this problem, a better approach using semidefinite
programming duality was recently proposed in [32].

The analysis of the previous sections proves that the eigenvalue relaxation is equally applicable to this
problem and may be a good competitor to the SDP relaxation. The most important point of our analysis is
the following: Theorem 4.1.1 proves that if the correlation matrix $R$ is componentwise negative outside the
diagonal, then strong duality holds, i.e. the detection problem can be solved exactly in polynomial time. The
construction of efficient signatures is currently an active research topic. For instance, the theory
of frames allows one to consider the problem from an interesting viewpoint, as developed in [35]. Our findings suggest
in particular that the componentwise negativity of the correlation matrix may be an interesting constraint to
look at in future investigations on this problem.

Figure 2: Three vectors with correlation matrix having negative off-diagonal components.

Finally, the eigenvalue relaxation can also be useful even for general signatures because of the weak duality 
property. Indeed, several recent publications prove that clever heuristics can perform better than the SDP 
relaxation. However, in real situations it is hard to certify that a primal solution provided by such a heuristic 
is indeed the optimal solution because the original signal is unknown. Comparing the dual optimal value to a 
primal value given by a heuristic can give a precise idea of the error without prior information on the signal. 
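
The sign condition on the correlation matrix $R$ discussed in this subsection is easy to test numerically. The following sketch uses finite-dimensional vectors as stand-ins for the signature waveforms, so that the integrals defining $R_{ij}$ become ordinary inner products; the particular configuration (three unit vectors in the plane, 120 degrees apart) is an assumption chosen for illustration, and the function names are hypothetical.

```python
import math

def gram_matrix(vectors):
    """Matrix of pairwise inner products, playing the role of the correlation matrix R."""
    return [[sum(p * q for p, q in zip(u, v)) for v in vectors] for u in vectors]

def has_negative_off_diagonal(R):
    """True when every off-diagonal entry of R is strictly negative."""
    n = len(R)
    return all(R[i][j] < 0 for i in range(n) for j in range(n) if i != j)

# Assumed example: three unit vectors in the plane, 120 degrees apart
signatures = [(math.cos(2 * math.pi * k / 3), math.sin(2 * math.pi * k / 3))
              for k in range(3)]
R = gram_matrix(signatures)
```

In this example every pair of distinct signatures has inner product $\cos(2\pi/3) = -1/2$, so the sufficient condition for strong duality is met.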

5.2.3 A numerical experiment 

In order to verify this point, we performed Monte Carlo simulations over 1000 random problems for a number
of users varying from 10 to 35. These computational experiments are reported in Figure 7, where the number of
users is on the x-axis and the average computation time is on the y-axis. The computations were performed
using the Scilab software [33]. The SDP solver, called Semidef, interfaces Boyd and Vandenberghe's sp.c program.
The eigenvalue relaxation was solved using the solver Optim with the "nd" option for possibly nondifferentiable
costs, as is the case here. The curves in Figure 7 interpolate the average computation times, versus the number of
users, for messages taken to be sequences of uniform and independent variables taking values in {0, 1}. The
dashed curve shows the results of the SDP relaxation, while the solid curve shows those of the
eigenvalue relaxation. Our computations suggest that the eigenvalue relaxation has lower complexity growth
as the number of users increases, exactly as expected. The reader should be warned that this experiment does
not prove that the complexity of the eigenvalue relaxation is lower than that of the SDP relaxation. The experiment
only shows that when a widely used routine for SDP is used, the eigenvalue relaxation, solved using a general
purpose bundle method available through free and well-established software, has a lower complexity growth on
this problem.

6 Appendix: Arrow matrices and strict interlacing of eigenvalues 

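Arrow matrices are matrices $A$ of the form

$A = \begin{pmatrix} D(a) & b \\ b^t & c \end{pmatrix}$.

The properties of the eigenvalues of such matrices have been well studied in the past. Some of them are
summarized in the following theorem.

Theorem A. Let $A$ be an arrow matrix, with $a_1 \leq a_2 \leq \ldots \leq a_n$.
Moreover, assume that all the components of $b$ are different from zero. Let $\lambda_1 \leq \lambda_2 \leq \ldots \leq \lambda_{n+1}$ be its
eigenvalues considered in increasing order. Then, the characteristic polynomial of $A$ is given by

$p_A(\lambda) = (c - \lambda) \prod_{i=1}^{n} (a_i - \lambda) - \sum_{i=1}^{n} \prod_{j \neq i} (a_j - \lambda)\, b_i^2$.

Moreover, we have $\lambda_1 < a_1$ and $a_n < \lambda_{n+1}$. If $a_i = a_{i+1}$, we have $a_i = \lambda_{i+1} = a_{i+1}$, and if $a_i < a_{i+1}$, we
have $a_i < \lambda_{i+1} < a_{i+1}$.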

The properties of the eigenvalues of arrow matrices are part of the folklore, especially in the realm of
mathematical physics. We give a sketch of the proof of this theorem below in order to give the main ideas
underlying the results.

Proof of Theorem A. The formula for the characteristic polynomial $p_A(\lambda) = \det(A - \lambda I)$ is easily obtained
by induction on the dimension. We have to consider two cases:

• for some $i$, $a_i = a_{i+1}$,

• $a_1 < a_2 < \cdots < a_n$.

In the first case $a_i$ is a root of $p_A$. In the second case $p_A(a_i) = -\prod_{j \neq i}(a_j - a_i)\, b_i^2$, which is different from zero
since we assumed all the $b_j$'s to be different from zero. In this case, the eigenvalues of $A$ are the zeros of the
function

$f(\lambda) = c - \lambda - \sum_{i=1}^{n} \frac{b_i^2}{a_i - \lambda}$.

From this formula, we deduce that there is a root in each interval $(-\infty, a_1)$, $(a_i, a_{i+1})$ for all $i = 1, \ldots, n-1$, and
$(a_n, +\infty)$.

The final conclusions are easily derived by combining the results in the two simple cases discussed above.
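
The strict interlacing can be checked numerically in the second case of the proof (all $a_i$ distinct, all $b_i$ nonzero). The sketch below locates one root per interval by bisection on the function obtained by dividing the characteristic polynomial by $\prod_i (a_i - \lambda)$, namely $c - \lambda - \sum_i b_i^2/(a_i - \lambda)$, which is strictly decreasing between consecutive poles. The function names and the outer search bound (a crude bound on the spectral radius) are choices made for this illustration.

```python
def char_poly(a, b, c, lam):
    """(c - lam) * prod_i (a_i - lam) - sum_i b_i^2 * prod_{j != i} (a_j - lam)."""
    n = len(a)
    full = 1.0
    for ai in a:
        full *= (ai - lam)
    total = (c - lam) * full
    for i in range(n):
        partial = 1.0
        for j in range(n):
            if j != i:
                partial *= (a[j] - lam)
        total -= b[i] ** 2 * partial
    return total

def secular(a, b, c, lam):
    """c - lam - sum_i b_i^2 / (a_i - lam); strictly decreasing between consecutive poles."""
    return c - lam - sum(bi ** 2 / (ai - lam) for ai, bi in zip(a, b))

def arrow_eigenvalues(a, b, c, max_iter=200):
    """One root per interval by bisection; assumes a strictly increasing and all b_i nonzero."""
    # Crude bound that exceeds the largest absolute row sum of the matrix
    bound = 1.0 + abs(c) + max(abs(x) for x in a) + sum(abs(x) for x in b)
    edges = [-bound] + list(a) + [bound]
    roots = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        for _ in range(max_iter):
            mid = 0.5 * (lo + hi)
            if mid <= lo or mid >= hi:  # interval exhausted at floating-point precision
                break
            if secular(a, b, c, mid) > 0:
                lo = mid  # the function is decreasing, so the root lies to the right
            else:
                hi = mid
        roots.append(0.5 * (lo + hi))
    return roots
```

For example, with $a = (1, 2, 3)$, $b = (1, 1, 1)$ and $c = 0$, the four computed roots fall in $(-\infty, 1)$, $(1, 2)$, $(2, 3)$ and $(3, +\infty)$ respectively, and their sum equals the trace $6$ of the matrix.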



7 Conclusion 

In this paper, we surveyed the main properties of the eigenvalue relaxation for binary least squares problems. A
full connection with the standard SDP relaxation was presented and we showed how to recover a solution of the
Semi-Definite program from the solution of the eigenvalue minimization problem. The problem of recovering
a primal binary solution was also addressed and we gave simple sufficient conditions for strong duality. In the case
where these conditions are not satisfied, the randomized procedure adapted from Goemans and Williamson's
allows one to recover binary solutions with a guaranteed relative approximation ratio due to Nesterov's bound. Two
applications were presented: binary image denoising and detection in multiuser CDMA systems. In the case
of image denoising, we showed that strong duality holds. For the multiuser detection problem, our results prove
that strong duality holds when the signature covariance matrix has nonpositive off-diagonal components.



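$A = \begin{pmatrix} D(a) & b \\ b^t & c \end{pmatrix}$

$p_A(\lambda) = (c - \lambda) \prod_{i=1}^{n} (a_i - \lambda) - \sum_{i=1}^{n} \prod_{j \neq i} (a_j - \lambda)\, b_i^2$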




□
Figure 3: Original image




Figure 4: Noisy image: i.i.d. $\mathcal{N}(0, 2)$




Figure 5: Percentage of misspecified bits vs. $\nu$




Figure 6: Recovered image 




Figure 7: Comparison of SDP and eigenvalue relaxations for CDMA multiuser detection 



